Detecting outliers in multivariate data while controlling false alarm rate

Author

  • André Achim
Abstract

Outlier identification often involves inspecting each z-transformed variable and adding a Mahalanobis D². Multiple outliers may mask each other by inflating variance estimates. Caroni & Prescott (1992) proposed a multivariate extension of Rosner’s (1983) technique to circumvent masking, taking sample size into account to keep the false alarm risk below, say, α = .05. Simulation studies here compare this single multivariate approach to "multiple-univariate plus multivariate" tests, each at a Bonferroni-corrected α level, in terms of power to detect outliers. Results suggest the former is better only up to about 12 variables. Macros in an Excel spreadsheet implement these techniques.
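The masking effect and the Bonferroni correction mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the Caroni & Prescott procedure and not the author's Excel macros. The data values, the function names `bonferroni_z_cutoff` and `max_abs_z`, the split of α over p + 1 tests, and the use of a plain normal quantile (which ignores sample size) are all assumptions made for illustration.

```python
from statistics import NormalDist, mean, stdev


def bonferroni_z_cutoff(n_tests, alpha=0.05):
    """Two-sided normal cutoff when alpha is split evenly over n_tests tests.

    Assumption: a plain normal quantile; it does not adjust for sample size.
    """
    per_test_alpha = alpha / n_tests
    return NormalDist().inv_cdf(1 - per_test_alpha / 2)


def max_abs_z(values):
    """Largest absolute z-score, using the sample mean and sample SD."""
    m, s = mean(values), stdev(values)
    return max(abs(v - m) / s for v in values)


# Hypothetical single-variable data: ten ordinary values around 10.
clean = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9]
one_outlier = clean + [14.0]
two_outliers = clean + [14.0, 14.2]

# Masking: the second outlier inflates the SD, so the largest |z| shrinks
# (roughly 3.0 with one outlier, roughly 2.2 with two).
print(max_abs_z(one_outlier))
print(max_abs_z(two_outliers))

# Assumed split: p univariate tests plus one multivariate test share alpha.
p = 12
print(bonferroni_z_cutoff(1))      # single test, about 1.96
print(bonferroni_z_cutoff(p + 1))  # stricter cutoff for each of 13 tests
```

With two outliers present, neither would exceed the stricter cutoff in this toy example, which is the kind of masking a sequential, Rosner-type procedure is meant to circumvent.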

Similar resources

Identification of outliers types in multivariate time series using genetic algorithm

Multivariate time series data are often modeled using a vector autoregressive moving average (VARMA) model. But the presence of outliers can violate the stationarity assumption and may lead to wrong modeling, biased parameter estimates and inaccurate prediction. Thus, detection of these points and how to deal properly with them, especially in relation to modeling and parameter estimation of VARMA m...

Ensuring high sensor data quality through use of online outlier detection techniques

Data collected by Wireless Sensor Networks (WSNs) are inherently unreliable. Therefore, to ensure high data quality, secure monitoring, and reliable detection of interesting and critical events, outlier detection mechanisms need to be in place. The constrained nature of resources available in WSNs necessitates that, unlike traditional outlier detection techniques performed off-line, outliers...

Robust distances for outlier-free goodness-of-fit testing

Robust distances are mainly used for the purpose of detecting multivariate outliers. The precise definition of cut-off values for formal outlier testing assumes that the “good” part of the data comes from a multivariate normal population. Robust distances also provide valuable information on the units not declared to be outliers and, under mild regularity conditions, they can be used to test th...

Anomaly Detection for Hyperspectral Imagery Using Analytical Fusion and RX

Anomaly detection is attractive for the analysis of hyperspectral imagery. This paper describes an expanded anomaly detection algorithm for small targets in hyperspectral imagery. As a variant of the well-known multivariate anomaly detector called the RX algorithm, an approach called the dimension reduction RX algorithm (DRRX) is proposed. The analytical fusion strategy is incorporated into the RX ...

Using labeled data to evaluate change detectors in a multivariate streaming environment

We consider the problem of detecting changes in a multivariate data stream. A change detector is defined by a detection algorithm and an alarm threshold. A detection algorithm maps the stream of input vectors into a univariate detection stream. The detector signals a change when the detection stream exceeds the chosen alarm threshold. We consider two aspects of the problem: (1) setting the alar...

Publication date: 2012